These are a few debugging notes that may be helpful for anyone developing Gumbo or trying to diagnose a tricky problem. Normal clients of this library will probably not need them\+: Gumbo is relatively stable, and bugs tend to be rare and obscure. However, they\textquotesingle{}re handy to have as a reference, and may also provide useful Google fodder to people searching for these tools.

Standard disclaimer\+: I use all of these techniques on my Ubuntu 14.\+04 computer with gcc 4.\+8.\+2, clang 3.\+4, and gtest 1.\+6.\+0, but make no warranty about them working on other systems. In particular, they\textquotesingle{}re almost certain not to work on Windows.

\section*{Debug output }

Gumbo has a compile-\/time switch to dump lots of debug output onto stdout. Compile with the G\+U\+M\+B\+O\+\_\+\+D\+E\+B\+UG define enabled\+:


\begin{DoxyCode}
$ make CFLAGS='-DGUMBO\_DEBUG'
\end{DoxyCode}


This lets us trace the operation of each of the tokenizer \& parser\textquotesingle{}s state machines in depth. Note that it spits {\itshape a lot} of debug information to the console and makes the program run significantly slower, so it\textquotesingle{}s usually helpful to first isolate the specific H\+T\+ML file or fragment that causes the bug.

\section*{Unit tests }

As mentioned in the R\+E\+A\+D\+ME, Gumbo relies on \href{https://code.google.com/p/googletest/}{\tt googletest} for unit tests. Unzip the gtest Z\+IP distribution inside the Gumbo root and rename the extracted directory to \textquotesingle{}gtest\textquotesingle{}. \textquotesingle{}make check\textquotesingle{} then runs the tests as usual\+:


\begin{DoxyCode}
$ make check
$ cat test-suite.log
\end{DoxyCode}


If you need to debug a core dump, you\textquotesingle{}ll probably want to run the test binary directly\+:


\begin{DoxyCode}
$ ulimit -c unlimited
$ make check
$ .libs/lt-gumbo\_test
$ gdb .libs/lt-gumbo\_test core
\end{DoxyCode}


The same goes for core dumps in other example binaries.

To run only a single unit test, pass the -\/-\/gtest\+\_\+filter=\textquotesingle{}Test\+Name\textquotesingle{} flag to the lt-\/gumbo\+\_\+test binary.
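
As a rough sketch of how the filter is typically used\+: googletest\textquotesingle{}s -\/-\/gtest\+\_\+list\+\_\+tests flag prints the registered test names, and -\/-\/gtest\+\_\+filter accepts \textquotesingle{}$\ast$\textquotesingle{} wildcards. The fixture name Gumbo\+Parser\+Test below is an assumption used for illustration; substitute a name taken from the listing.


\begin{DoxyCode}
\# Print the available test names (fixture names end with a dot).
$ .libs/lt-gumbo\_test --gtest\_list\_tests
\# Run every test in one fixture; 'GumboParserTest' is a hypothetical name.
$ .libs/lt-gumbo\_test --gtest\_filter='GumboParserTest.*'
\end{DoxyCode}


Quoting the pattern keeps the shell from expanding the wildcard before the test binary sees it.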

\section*{Assertions }

Gumbo relies pretty heavily on assertions. They\textquotesingle{}re enabled by default; to compile them out, define N\+D\+E\+B\+UG\+:


\begin{DoxyCode}
$ make CFLAGS='-DNDEBUG'
\end{DoxyCode}


\section*{A\+S\+AN }

Google\textquotesingle{}s \href{https://code.google.com/p/address-sanitizer/}{\tt address-\/sanitizer} is a helpful tool that lets you find memory errors with overhead low enough that you can often run it in production. Enabling it for C/\+C++ binaries is pretty standard and described on the A\+S\+AN documentation pages. It requires Clang $>$= 3.\+1 or G\+CC $>$= 4.\+8.


\begin{DoxyCode}
$ make \(\backslash\)
    CFLAGS='-fsanitize=address -fno-omit-frame-pointer -fno-inline' \(\backslash\)
    LDFLAGS='-fsanitize=address'
\end{DoxyCode}


A\+S\+AN can also be used when Gumbo is compiled as a shared library and linked into a scripting language via F\+FI, but this use-\/case is unsupported by the A\+S\+AN authors. To do it, use L\+D\+\_\+\+P\+R\+E\+L\+O\+AD to ensure the A\+S\+AN runtime support is included in the process\+:


\begin{DoxyCode}
$ LD\_PRELOAD=libasan.so.0 python -c 'import gumbo; gumbo.parse(problem\_text)'
\end{DoxyCode}


Getting clean stack traces from this requires the use of the llvm-\/symbolizer binary, included with clang\+:


\begin{DoxyCode}
$ export ASAN\_SYMBOLIZER\_PATH=/usr/bin/llvm-symbolizer-3.4
$ export ASAN\_OPTIONS=symbolize=1
$ LD\_PRELOAD=libasan.so.0 python -c \(\backslash\)
  'import gumbo; gumbo.parse(problem\_text)' 2>&1 | head -100
$ killall llvm-symbolizer-3.4
$ killall llvm-symbolizer-3.4
$ killall llvm-symbolizer-3.4
\end{DoxyCode}


This use case is even less officially supported than using A\+S\+AN with dynamic shared objects; on my machine, it led to a recursive A\+S\+AN error about a use-\/after-\/free in llvm-\/symbolizer, effectively fork-\/bombing the machine. Have the killalls ready, and avoid anything that lets the process run for too long (e.\+g. piping its output to \textquotesingle{}less\textquotesingle{} instead of \textquotesingle{}head\textquotesingle{}).